
Introduction

I'm currently working with small language models, especially Llama. Since I want to serve the model with LM Studio, I need to convert it into GGUF format. The conversion takes two steps:

.pth (Meta Checkpoint)
  ↓ convert_llama_weights_to_hf.py
HF directory (config.json, tokenizer.model, .safetensors)
  ↓ convert_hf_to_gguf.py
GGUF file (compatible with llama.cpp / LM Studio)

Requirements

pip install torch numpy sentencepiece transformers huggingface_hub gguf

Llama Datafile

Llama can be downloaded from Meta's download page (Class Leading, Open-Source AI | Download Llama). For CodeLlama, use the meta-llama/codellama repository.

After the download finishes, the directory contains these files:

- consolidated.00.pth, consolidated.01.pth, etc.
  - Model weight data
  - Sharded (e.g., the 13B model has 2 shards, 65B has 8)
  - .pth contains raw tensor data only, no model architecture
- params.json
  - Model hyperparameter settings (e.g., hidden size, vocab size, num_layers)
  - Used to generate the HF config during conversion
- tokenizer.model
  - SentencePiece tokenizer model (used by Llama 1, Llama 2, and CodeLlama)
  - Tokenisation schema
  - Converted to tokenizer.json and tokenizer_config.json in HF format
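Before converting, it can help to confirm that the download is complete. The following is a minimal sketch, assuming the directory layout described above (consolidated.*.pth shards next to params.json). The function name summarize_checkpoint is hypothetical and is not part of any Meta or Hugging Face tooling.

```python
import json
from pathlib import Path


def summarize_checkpoint(ckpt_dir):
    """Return the shard file names and hyperparameters found in a Meta checkpoint directory."""
    ckpt_dir = Path(ckpt_dir)
    # Shards are expected to be named consolidated.00.pth, consolidated.01.pth, ...
    shards = sorted(p.name for p in ckpt_dir.glob("consolidated.*.pth"))
    # params.json holds the hyperparameters later used to build the HF config.
    params = json.loads((ckpt_dir / "params.json").read_text())
    return {"shards": shards, "params": params}
```

Calling summarize_checkpoint("/path/to/CodeLlama-13B") should list two shards for a 13B checkpoint. A different count suggests an incomplete download.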

PyTorch .pth → Hugging Face Format Conversion

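Next, convert the .pth checkpoint to the Hugging Face format. The GGUF converter reads its metadata (model config and tokenizer files) from an HF-format directory, so this intermediate step is required. Because my local disk space is limited, I also want to store my models on the Hugging Face Hub.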

python convert_llama_weights_to_hf.py \
  --input_dir /path/to/CodeLlama-13B \
  --model_size 13B \
  --output_dir ./hf-codellama13b

This script (shipped with transformers) converts Meta's CodeLlama .pth weights and tokenizer into a Hugging Face directory with sharded .safetensors files.
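Before moving on to the GGUF step, a quick look at the output directory can catch a failed conversion early. This is a rough sketch, assuming the converted directory holds config.json, at least one .safetensors file, and a tokenizer file; check_hf_dir is a hypothetical helper, and the exact set of files may differ between transformers versions.

```python
import json
from pathlib import Path


def check_hf_dir(hf_dir):
    """Return a list of problems found in a converted Hugging Face model directory."""
    hf_dir = Path(hf_dir)
    problems = []
    config_path = hf_dir / "config.json"
    if not config_path.is_file():
        problems.append("missing config.json")
    else:
        config = json.loads(config_path.read_text())
        # HF configs normally name the model class, e.g. ["LlamaForCausalLM"].
        if "architectures" not in config:
            problems.append("config.json has no 'architectures' entry")
    if not any(hf_dir.glob("*.safetensors")):
        problems.append("no .safetensors weight files")
    # Either the SentencePiece model or the converted tokenizer.json is accepted here.
    if not any((hf_dir / n).is_file() for n in ("tokenizer.model", "tokenizer.json")):
        problems.append("no tokenizer file")
    return problems
```

An empty list from check_hf_dir("./hf-codellama13b") means the directory at least looks complete. It does not validate the weights themselves.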

Hugging Face → GGUF Conversion

python llama.cpp/convert_hf_to_gguf.py ./hf-codellama13b --outfile codellama13b.gguf --outtype f32
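To check that the output really is a GGUF file, the fixed-size header can be read directly. The sketch below assumes GGUF version 2 or later (64-bit counts) and little-endian byte order; read_gguf_header is a hypothetical helper, not part of llama.cpp or the gguf package.

```python
import struct


def read_gguf_header(path):
    """Read the GGUF header: magic, version, tensor count, metadata key/value count."""
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != b"GGUF":
            raise ValueError(f"not a GGUF file (magic={magic!r})")
        # Little-endian: uint32 version, uint64 tensor count, uint64 metadata KV count.
        version, n_tensors, n_kv = struct.unpack("<IQQ", f.read(20))
    return {"version": version, "tensors": n_tensors, "metadata_kv": n_kv}
```

Running read_gguf_header("codellama13b.gguf") should return a non-zero tensor count. A ValueError means the file is truncated or is not GGUF at all.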

How to Upload the Model to Hugging Face

❯ cd CodeLlama-7b
❯ huggingface-cli upload Berom0227/codellama-7b --type model
huggingface-cli: error: unrecognized arguments: --type model

❯ huggingface-cli repo create Berom0227/codellama-7b --type model
The --type argument is deprecated and will be removed in a future version. Use --repo-type instead.
Successfully created Berom0227/codellama-7b on the Hub.
Your repo is now available at https://huggingface.co/Berom0227/codellama-7b

❯ huggingface-cli upload Berom0227/codellama-7b . --include='*'
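The same two steps (create the repo, then upload the folder) can also be done from Python with huggingface_hub. This is a minimal sketch, assuming you are already logged in (for example via huggingface-cli login); upload_model_dir and its api parameter are hypothetical names introduced here so the function can be tried with a stand-in object.

```python
def upload_model_dir(folder, repo_id, api=None):
    """Create a model repo if needed, then upload every file in `folder` to it."""
    if api is None:
        # Imported lazily so the function can be exercised without the package installed.
        from huggingface_hub import HfApi
        api = HfApi()
    # exist_ok=True makes repeated runs safe when the repo already exists.
    api.create_repo(repo_id=repo_id, repo_type="model", exist_ok=True)
    return api.upload_folder(folder_path=folder, repo_id=repo_id, repo_type="model")
```

For the repo above, the call would be upload_model_dir(".", "Berom0227/codellama-7b") run from inside the CodeLlama-7b directory.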

Hugging Face CLI Commands Explanation